Percolation of unsatisfiability in finite dimensions 
The satisfiability and optimization of finite-dimensional Boolean formulas are studied using
percolation theory, rare region arguments, and boundary effects. In contrast with mean-field results,
there is no satisfiability transition, though there is a logical connectivity transition. In part of 
the disconnected phase, rare regions lead to a divergent running time for optimization algorithms. 
The thermodynamic ground state for the NP-hard two-dimensional maximum-satisfiability prob- 
lem is typically unique. These results have implications for the computational study of disordered 
materials. 



Complex problems with many degrees of freedom are 
of interest to both physicists and theoretical computer 
scientists. The overlap is especially strong between the 
physics of disordered materials and optimization prob- 
lems in the typical case. For example, there is a close 
correspondence between the ground states of Ising spin 
glasses, with up and down spins, and optimal assignments 
of Boolean variables, which can be true or false, in a logi- 
cal formula. This correspondence is more than superficial 
as both systems exhibit phase transitions in the struc- 
ture of minimal configurations and in the dynamics of 
the physical systems or optimization algorithms [1]. Such
connections lead to advances in the two fields. Combina- 
torial optimization algorithms from computer science are 
often employed to simulate disordered condensed matter 
systems [2]. Approaches from statistical physics, including
techniques such as replica theory and concepts such as
the thermodynamic limit and scaling, have proven useful
in studying the running time of algorithms and the structure
of solution space [3].

Motivated by work on mean field Boolean formulas and 
progress in understanding models of finite-dimensional 
disordered materials, we investigate ensembles of Boolean 
formulas whose graphs are two-dimensional. These for- 
mulas are composed by conjunctively joining logical 
clauses, with each clause formed using nearest neighbor 
variables. The optimization problem is to assign truth 
values so as to satisfy the maximum number of clauses in 
the formula. This is closely analogous to minimizing the 
number of broken bonds in an Ising spin glass 0. Us- 
ing ideas from statistical physics, including percolation 
and thermodynamic ground states, we find a transition 
in the structure of logically connected components and 
investigate the uniqueness of optimal assignments. 

Decomposing the problem into clusters of strongly 
connected components that contain contradictory cycles 
greatly reduces the running time of an exact optimiza- 
tion algorithm. These contradictory strongly-connected 
components (CSCs) need not percolate even though the
clauses form percolative structures. In addition, the 
rapid convergence to a unique ground state as the size 
of the problem increases suggests that the problem is 
easy in the typical case, though it is classified as difficult
in the worst-case sense. Two central categories in this
classification from computational complexity theory are
P and NP decision (yes/no) problems. Problems in
P can be decided in time polynomial in the size of the 
problem description, while a proof of the answer for NP 
problems can be checked in polynomial time. NP-hard 
problems, a solution to which could be used to quickly 
solve any problem in NP, are believed to be solvable only 
in exponential time for the worst case realizations. It 
may well be that many NP-hard problems derived from 
physical systems, such as finding the ground state config- 
uration for the 2D spin-glass in a magnetic field |f|, are 
typically solvable in polynomial time. Our results sup- 
port this possibility. NP-hard problems with algorithms 
that typically take polynomial time on some problem sets 
are known [5], but have not been extensively and directly
studied for physical problems in finite dimensions. 

We consider finite-dimensional Boolean formulas Z of 
the form 



$$Z = \bigwedge_{\ell} \Bigl( \bigvee_{k=0}^{K-1} y_\ell^k \Bigr), \qquad (1)$$



where $\vee$ is the logical OR operation, $\wedge$ is the logical
AND operation, and $\{y_\ell^k\}$ are literals chosen from the set
$Y = \{x_1, \ldots, x_N, \bar{x}_1, \ldots, \bar{x}_N\}$ of $N$ Boolean variables and
their negations. The variables are identified with the
vertices of a two-dimensional lattice. We specialize to
clauses with $K = 1$ and $K = 2$. We form 2-clauses
by choosing two neighboring variables and negating each
variable with probability 1/2. The 1-clauses are single
literals, with probability 1/2 of negation. A sample formula
is depicted in Fig. 1(a). The ensemble is defined by
parameters $\alpha$ and $\gamma$, respectively the ratios of the number
of 2-clauses to $N$ and of 1-clauses to $N$. The 2-clauses
do not overlap and no two 1-clauses contain the same
variable. Given a truth assignment $x_i \to \{T, F\}$ for all
Boolean variables, a clause is satisfied if at least one of the literals
in the clause is $T$. If all clauses are satisfied, the formula
$Z$ is satisfied. Determining the existence of a satisfying
truth assignment is the problem of satisfiability (SAT).
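
As a rough illustration of the ensemble just described, the following sketch samples one formula on an $L \times L$ square lattice and counts unsatisfied clauses for a given assignment. The free boundary conditions, the signed-integer encoding of literals, and the names `sample_formula` and `count_unsatisfied` are assumptions made here for illustration; they are not taken from the paper.

```python
import random

def sample_formula(L, alpha, gamma, seed=None):
    """Sample a random formula on an L x L square lattice (free boundaries).

    Literals are nonzero integers: +v is variable v, -v is its negation,
    with variables numbered 1..N and N = L * L. (Encoding is an assumption.)
    """
    rng = random.Random(seed)
    N = L * L
    var = lambda r, c: r * L + c + 1
    # All nearest-neighbour bonds: horizontal, then vertical.
    bonds = [(var(r, c), var(r, c + 1)) for r in range(L) for c in range(L - 1)]
    bonds += [(var(r, c), var(r + 1, c)) for r in range(L - 1) for c in range(L)]
    n2, n1 = round(alpha * N), round(gamma * N)
    sign = lambda: rng.choice((1, -1))  # negate with probability 1/2
    # Distinct bonds, so no two 2-clauses sit on the same pair of variables.
    two = [(sign() * i, sign() * j) for i, j in rng.sample(bonds, n2)]
    # Distinct variables, so no two 1-clauses contain the same variable.
    one = [(sign() * v,) for v in rng.sample(range(1, N + 1), n1)]
    return two + one

def count_unsatisfied(clauses, assignment):
    """assignment maps variable index -> bool; a clause needs one true literal."""
    def is_true(lit):
        return assignment[abs(lit)] == (lit > 0)
    return sum(not any(is_true(l) for l in clause) for clause in clauses)
```

With free boundaries the lattice has $2L(L-1)$ bonds, so this sketch requires $\alpha N \le 2L(L-1)$.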

The optimization of the number of satisfied clauses in 
Z can be mapped to determining the ground state of a 
spin glass in a heterogeneous field. This mapping translates
Boolean assignments $x_i \in \{F, T\}$ to spin variables
$S_i \in \{-1, +1\}$. A bond energy $E_\ell$ can be assigned to a
clause $(y_\ell^0 \vee y_\ell^1)$ connecting variables $x_i$ and $x_j$ via [3]

$$E_\ell = \frac{J}{4}\left[1 - A(y_\ell^0) S_i - A(y_\ell^1) S_j + A(y_\ell^0) A(y_\ell^1) S_i S_j\right], \qquad (2)$$

where $A(y_\ell^0) = 1$ if $y_\ell^0 = x_i$ and $A(y_\ell^0) = -1$ if $y_\ell^0 = \bar{x}_i$,
and similarly for $j$ (nearest neighbor to $i$), replacing the superscript 0
with 1. The total spin glass energy $E$ is given by $E = \sum_\ell E_\ell$.
Any clause that is not satisfied costs an energy
$J$; the existence of an $E = 0$ ground state is equivalent
to satisfiability of the Boolean formula.
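
The claim that each unsatisfied clause costs exactly $J$ can be checked by enumerating all sign and spin combinations. The sketch below does this for Eq. (2); the function names and the choice $J = 1$ are illustrative assumptions.

```python
from itertools import product

J = 1.0  # energy cost of one unsatisfied clause (arbitrary units, assumed)

def bond_energy(a_i, a_j, s_i, s_j):
    """Eq. (2): a = +1 for a plain literal, -1 for a negated one;
    s = +1 for a true variable, -1 for a false one."""
    return J / 4 * (1 - a_i * s_i - a_j * s_j + a_i * a_j * s_i * s_j)

def clause_satisfied(a_i, a_j, s_i, s_j):
    # A literal is true when its sign matches the spin, i.e. a * s == +1.
    return a_i * s_i == 1 or a_j * s_j == 1

# Enumerate all 16 combinations: the energy is 0 for a satisfied clause
# and J for an unsatisfied one.
for a_i, a_j, s_i, s_j in product((1, -1), repeat=4):
    expected = 0.0 if clause_satisfied(a_i, a_j, s_i, s_j) else J
    assert bond_energy(a_i, a_j, s_i, s_j) == expected
```

The check works because the bracket in Eq. (2) factorizes as $(1 - A(y_\ell^0) S_i)(1 - A(y_\ell^1) S_j)$, which equals 4 only when both literals are false.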

Resolution [5] is a method that can be used to quickly
decide SAT for $K \le 2$. This procedure is equivalent to
mapping each 2-clause to a pair of logical implications
and searching for "contradictory cycles" (CCs). For example,
the clause $x_1 \vee x_2$ is equivalent to $\bar{x}_1 \to x_2$ and
$\bar{x}_2 \to x_1$. Clauses with $K = 1$ are replaced by a single
implication, e.g., $x_1$ becomes $\bar{x}_1 \to x_1$. The Boolean
formula can be represented by an implication digraph
(i.e., directed graph) $G = (Y, E)$ with $2N$ vertices and
$(2\alpha + \gamma)N$ edges $E$. For a sample mapping, see Fig. 1(b).
The formula $Z$ cannot be satisfied if there is a CC, which
is a path $p$ in $G$ that connects a variable to its negation
and vice versa, i.e., $p = (x_i \to x_j \to \cdots \to \bar{x}_i \to \cdots \to x_i)$.
For the formulas we consider, the existence of contradic- 
tory cycles can be decided in time linear in N 0. 
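
One standard way to search for contradictory cycles is to compute the strongly connected components of the implication digraph: a CC exists exactly when a variable and its negation share a component. The sketch below uses Kosaraju's algorithm, which runs in time linear in the number of vertices and edges. It is a generic illustration rather than the code used in the paper, and the signed-integer literal encoding is an assumption.

```python
from collections import defaultdict

def implication_digraph(clauses):
    """Clauses are tuples of nonzero ints (+v = x_v, -v = NOT x_v), length 1 or 2."""
    graph = defaultdict(list)
    for clause in clauses:
        a, b = clause if len(clause) == 2 else (clause[0], clause[0])
        graph[-a].append(b)          # NOT a -> b
        if len(clause) == 2:
            graph[-b].append(a)      # NOT b -> a
    return graph

def strongly_connected_components(graph, nodes):
    """Iterative Kosaraju algorithm; returns a dict node -> component label."""
    order, seen = [], set()
    for root in nodes:               # first pass: record finishing order
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(graph.get(root, ())))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph.get(nxt, ()))))
                    break
            else:                    # all successors explored
                order.append(node)
                stack.pop()
    reverse = defaultdict(list)
    for u, targets in graph.items():
        for v in targets:
            reverse[v].append(u)
    comp = {}
    for root in reversed(order):     # second pass on the reversed digraph
        if root in comp:
            continue
        comp[root] = root
        stack = [root]
        while stack:
            node = stack.pop()
            for prev in reverse.get(node, ()):
                if prev not in comp:
                    comp[prev] = root
                    stack.append(prev)
    return comp

def is_satisfiable(clauses):
    graph = implication_digraph(clauses)
    variables = {abs(l) for clause in clauses for l in clause}
    nodes = [s * v for v in variables for s in (1, -1)]
    comp = strongly_connected_components(graph, nodes)
    # A contradictory cycle exists iff x and NOT x lie in the same component.
    return all(comp[v] != comp[-v] for v in variables)
```

For example, `is_satisfiable([(3, 1), (-1, 2), (-2, 3), (-3, 4), (-4, 5), (-5, -3)])` returns `False`: the first three clauses force variable 3 to be true and the last three force it to be false. This instance was constructed for illustration, with two forcing triangles sharing one variable, and is not the formula shown in Fig. 1(b).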




FIG. 1: (a) Example finite-dimensional Boolean formula. 
Each 2-clause in the formula is an edge, represented by two 
segments. Circles represent 1-clauses. Black segments or cir- 
cles indicate negated variables, while the lighter shaded seg- 
ments or circles represent variables that are not negated. The 
formula depicted is (xiV X3) A (12 V14) A (14 V17) A (3:2) A (x&). 
(b) A smallest unsatisfiable subgraph for the triangular lat- 
tice (left) and its digraph (right). The subgraph's formula is 

(xoVxi)/\(x VX2)^XlVX2)h(X2VXz)h{X2VX4)h(xz\/X4). A 

contradictory cycle (CC), in this digraph is X2 — * xo — > x\ — > 

X2 — > X-s — > Xi — > X2- 

We find that there are CCs for any $\alpha > 0$ (taking
$\gamma = 0$), as $N \to \infty$, in these finite-dimensional formulas.
Defining $\alpha_{1/2}(N)$ as the value of $\alpha$ for which 1/2 of
the finite-dimensional $N$-variable formulas are satisfiable,
$\alpha_{1/2}(N) \to 0$ as $N \to \infty$ (see Fig. 2). This crossover is
coarse, in that the width of the crossover from low to high
probability of satisfiability is proportional to $\alpha_{1/2}(N)$, for
large $N$. This is to be contrasted with random mean-field
$K = 2$ formulas, where for $N \to \infty$, there is a sharp SAT
to UNSAT phase transition (the probability that a formula
is satisfiable is 1 for $\alpha < \alpha_c = 1$ and 0 for larger $\alpha$).
These differences result from small CCs, which at small $\alpha$
are exponentially rare in the mean field case but appear
with Poissonian statistics in the finite-dimensional case,
where loops are more important.




FIG. 2: (a) Plot of $\alpha_{1/2}(N)$, the clause density at which 1/2
of the graphs are satisfiable, as a function of lattice size $N$.
Symbols indicate numerical results for 2SAT and 1-in-2-SAT
(Ising spin glass) on triangular and square lattices. Curves are
analytic approximations found in a small subgraph expansion.

The location of the SAT/UNSAT crossover can be 
computed by an expansion in a. Some subgraphs are 
"forcing", i.e., in all satisfying assignments one of the 
variables has a fixed truth value. The smallest unsatisfi- 
able graph is found by joining two contradictory forcing 
subgraphs. An example of this graph type is depicted 
in Fig. 1(b). On the triangular lattice, these subgraphs
have density $\rho_\triangle(\alpha) = c_6 \alpha^6 + O(\alpha^7)$. The density of
the simplest unsatisfiable graphs on the square lattice is
$\rho_\square(\alpha) = c_8 \alpha^8 + O(\alpha^9)$. In general, if the smallest unsatisfiable
subgraph has $r$ bonds and density $c_r \alpha^r$, the probability
of satisfiability is $P_{\mathrm{sat}}(N) = (1 - c_r \alpha^r)^N$, to lowest
order in $\alpha$, giving $\alpha_{1/2}(N) \approx (c_r^{-1} \ln 2)^{1/r} N^{-1/r}$. We plot
numerical results and analytic expansions for $\alpha_{1/2}(N)$ in
Fig. 2, which includes the next order analytic corrections
in $\alpha$ (7-edged subgraphs with density $c_7 \alpha^7 + O(\alpha^8)$ on
the triangular lattice and 9-edged subgraphs with density
$c_9 \alpha^9 + O(\alpha^{10})$ on the square lattice).

We also plot analytic estimates and numerical results 
for $\alpha_{1/2}(N)$ for the 1-in-2-SAT problem in Fig. 2. While
a clause in 2SAT (i.e., $K = 2$) is satisfied if either literal
is true, a clause is satisfied in 1-in-2-SAT when exactly
one literal in a clause is true. The 1-in-2-SAT
problem maps both to an Ising spin glass in the absence
of a magnetic field and to the two-color problem.
The smallest unsatisfiable graphs are given
by frustrated cycles, giving $\alpha_{1/2}(N) \approx 3 (N/\ln 2)^{-1/3}$
and $\alpha_{1/2}(N) \approx 2^{5/4} (N/\ln 2)^{-1/4}$ for the triangular and
square lattices, respectively.
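
The lowest-order estimate follows from setting $(1 - c_r \alpha^r)^N = 1/2$, which for large $N$ gives $N c_r \alpha^r \approx \ln 2$. The sketch below evaluates this for the 1-in-2-SAT case. The prefactors $c_3 = 1/27$ and $c_4 = 1/32$ are not quoted in the text; they are inferred here from the two closed-form expressions above, and the names `alpha_half` and `LATTICES` are illustrative.

```python
import math

def alpha_half(N, c_r, r):
    """Lowest-order crossover density, from N * c_r * alpha**r = ln 2."""
    return (math.log(2) / (c_r * N)) ** (1.0 / r)

# Prefactors inferred from the quoted 1-in-2-SAT formulas (an assumption):
# c_3 = 1/27 reproduces 3 (N / ln 2)**(-1/3) on the triangular lattice,
# c_4 = 1/32 reproduces 2**(5/4) (N / ln 2)**(-1/4) on the square lattice.
LATTICES = {"triangular": (1 / 27, 3), "square": (1 / 32, 4)}

if __name__ == "__main__":
    for N in (10**2, 10**4, 10**6):
        row = {name: round(alpha_half(N, c, r), 3)
               for name, (c, r) in LATTICES.items()}
        print(N, row)
```

Because the estimate decays only as $N^{-1/r}$, the crossover density approaches zero slowly with system size.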

Given the lack of a sharp SAT/UNSAT transition, due 
to the existence of small unsatisfiable graphs, we have
investigated the percolation of large unsatisfiable graphs
as a phase transition. We study these graphs within the
context of MAXSAT, which is the problem of minimizing
the number of unsatisfied clauses. In two dimensions,
the determination of the ground state for the Ising spin
glass (or MAX-1-in-2-SAT) is in P [6], while determining
the ground state for MAXSAT with $K = 2$ is NP-hard,
even for planar graphs. We studied the CSCs, sets of
literals for which any two literals are connected by a directed
path in the implication digraph and which contain
a CC. We find that the probability of having a spanning
CSC has a transition that becomes sharper with increasing
$N$, with a critical value for $\alpha$ of $\alpha_S = 1.8245(5)$ on
the triangular lattice. The cluster size distribution $n(s)$
at criticality behaves as $n(s) \sim s^{-\tau}$, with $\tau = 2.02(5)$.
The scaling of the probability for a spanning CSC near
$\alpha_S$ gives a correlation length exponent of $\nu = 1.32(3)$.
These values are consistent with the 2D values for standard
percolation, where $\tau = 187/91$ and $\nu = 4/3$ [9].

Percolation of paths in the implication digraph can be 
related to connectivity percolation [10]. This is done by
projecting part of the implication digraph on $2N$ literals
onto an undirected graph of $N$ variables. This "trimmed"
projection has the same statistics as standard connectivity
percolation with edge probability $\tilde{p} = 2p - p^2$, where
for our finite-dimensional case, $p = \alpha/2z$, with $z = 4$
($z = 6$) for square (triangular) lattices, if overlapping
clauses are allowed. (On the triangular lattice with overlapping
clauses, we find $\alpha_S = 1.887(2)$.) In mean field,
the percolation of paths containing a contradiction coincides
with the SAT/UNSAT transition. This is not the
case in finite dimensions. But given this type of connection
and studies of percolation of directed edges in finite
dimensions [11], it is not surprising that CSC percolation
appears to be in the same universality class as standard
connectivity percolation.

The decomposition of the graphs into CSCs speeds up 
exact search algorithms for MAXSAT. Here, we apply 
this decomposition to estimate running times of such an 
algorithm. We used a MAXSAT code [12] that first finds
a heuristic bound to the solution and then applies an ex- 
act Davis-Putnam-Loveland-Logemann (DPLL) search. 
The running time measure t is the number of "back- 
tracks" that are executed while partially exploring the 
tree of all possible assignments. Each CSC cluster can 
be loaded into the algorithm individually 0]. The sum 
of the unsatisfied clauses from each cluster gives the min- 
imal number of unsatisfied clauses for the entire formula. 
When $\alpha < \alpha_S$, the distribution of sizes of the CSCs is
exponentially decaying in the cluster size, $n(s) \sim e^{-s/s_\xi(\alpha)}$,
with $s_\xi \propto \xi^d \propto (\alpha_S - \alpha)^{-d\nu}$. If we plot the median number
of backtracks for each cluster, we find that the median
running time of the DPLL-type algorithm scales exponentially
with the cluster size, $t^*(s) \sim e^{s/s_T(\alpha)}$. When
$s_\xi(\alpha) < s_T(\alpha)$, the median running time for a sample,
$T^*(L)$, is bounded by a multiple of the system volume,
$T^*(L) \sim L^2$. However, when $s_\xi > s_T$, $T^*(L)$ diverges
more rapidly, with an estimate for the largest cluster
size in a finite sample giving $T^*(L) \sim L^{2 s_\xi / s_T}$. The
mean running time, $T(L)$, diverges exponentially with $L$.
The transition between the linear and superlinear median
time behaviors defines $\alpha_C < \alpha_S$ via $s_\xi(\alpha_C) = s_T(\alpha_C)$.
Fig. 3 shows convolutions of the cluster size distribution
$n(s)$ and the median time $t^*(s, \alpha)$ as a function of size.
The change from negative to positive slope on the semilog
plot gives $\alpha_C \approx 1.3$ for the DPLL code we use. This
slowing down of the algorithmic dynamics is similar to 
that for the physical dynamics of random magnets [l^j 
and is reminiscent of the change from the easy-SAT to 
hard-SAT phases in random graphs 0. 
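
The rare-region estimate in this paragraph can be written out in a few lines. In an $L \times L$ sample the largest cluster satisfies $L^2 e^{-s_{\max}/s_\xi} \sim 1$, so $s_{\max} \sim 2 s_\xi \ln L$, and the time spent on it is $e^{s_{\max}/s_T} = L^{2 s_\xi / s_T}$. The sketch below encodes these relations; the function names are illustrative, and any numerical values passed for `s_xi` and `s_T` are hypothetical inputs, not measurements from the paper.

```python
import math

def largest_cluster(L, s_xi):
    """Typical largest cluster in an L x L sample, from L**2 * exp(-s / s_xi) ~ 1."""
    return 2.0 * s_xi * math.log(L)

def median_time_exponent(s_xi, s_T):
    """Exponent x in T*(L) ~ L**x when clusters are solved one at a time.

    The L**2 clusters of typical size contribute a term of order L**2;
    the largest cluster contributes L**(2 * s_xi / s_T). The larger wins.
    """
    return 2.0 * max(1.0, s_xi / s_T)

def log_convolution_slope(s_xi, s_T):
    """Slope of log[n(s) * t*(s)] versus s; its sign flips at s_xi == s_T."""
    return 1.0 / s_T - 1.0 / s_xi
```

A negative slope corresponds to the regime where the running time is proportional to the volume, and a positive slope to the superlinear regime, which is the sign change read off the semilog plot in Fig. 3.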



[Fig. 3 plot legend: $\alpha = 1.5$, $\alpha = 1.4$]




FIG. 3: Convolution of the median number of backtracks t* (s) 
with the CSC cluster size distribution n(s), where s is the 
cluster mass. Note that $\alpha < \alpha_S$ for these curves.

Despite the divergence of the running times for the ex- 
haustive DPLL-type algorithms, we might expect that 
the ground states could be found in time proportional to 
the system volume in the typical case, even above the 
CSC percolation transition. Assuming that the droplet 
picture describes these finite-dimensional spin glasses, 
the presence of the magnetic field destroys the spin glass 
phase ^^] and the correlations are finite-ranged (though 
in 2SAT there are some correlations in the external 
fields.) So while the CSCs percolate, the effects of frus- 
tration remain localized beyond some length scale. This 
picture also implies one unique thermodynamic state. If 
this is the case there may be a way to develop a new 
algorithm to deal with the local frustrated bonds, either 
by solving subsystems and joining the solutions together 
to form the whole system or by a more clever heuristic 
algorithm. We leave this as a conjecture and simply test 
for uniqueness. 

To test whether the ground state is unique, we study 
the effect of boundary conditions, similar to studies of the 
Ising spin glass [ltij . By comparing ground states for a 
system of linear size L and an expanded system of linear
size L' (each with free boundaries), one can determine if 
the ground state is unique or not from the sensitivity to 
boundary conditions. If the solutions in the common sub- 
system of linear size w become fixed as L and L' diverge, 
a unique ground state exists in the thermodynamic limit. 
Note that the ground state must be unique for $\alpha < \alpha_S$,
as the logical structure of the graph does not percolate. 

Since the ± J spin glass with magnetic field (equivalent 
to optimal assignments for MAX2SAT) has many degen- 
erate ground states, we study the weighted MAX2SAT 
(WMAXSAT) question, where the degeneracy is broken, 
to be able to compare ground state solutions directly.
Each clause has an associated weight, chosen uniformly 
in the interval [0,1), and the optimization problem is 
now to minimize the sum of the weights of the unsatis- 
fied clauses. We also introduce 1-clauses, with the same 
weight distribution. The addition of 1-clauses lowers $\alpha_S$,
allowing us to study a larger range of system sizes,
as the graphs are sparser. 
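
To make the weighted objective concrete, the sketch below minimizes the total weight of unsatisfied clauses by exhaustive enumeration. This is not the heuristic-plus-DPLL code used in the paper; it is a brute-force stand-in that is only practical for very small instances, and the function name and clause encoding are assumptions.

```python
from itertools import product

def weighted_maxsat_bruteforce(n_vars, weighted_clauses):
    """Return (minimum total weight of unsatisfied clauses, one optimal assignment).

    weighted_clauses is a list of (weight, clause) pairs; each clause is a
    tuple of nonzero ints (+v = x_v, -v = NOT x_v) over variables 1..n_vars.
    All 2**n_vars assignments are tried, so this is for tiny instances only.
    """
    best_cost, best_assignment = float("inf"), None
    for bits in product((False, True), repeat=n_vars):
        cost = sum(
            weight
            for weight, clause in weighted_clauses
            if not any(bits[abs(l) - 1] == (l > 0) for l in clause)
        )
        if cost < best_cost:
            best_cost, best_assignment = cost, bits
    return best_cost, best_assignment
```

With continuous weights, two different assignments almost never have exactly the same cost, which is what makes the ground states of different system sizes directly comparable.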

FIG. 4: Log-linear plot of $P(2L, L, w)$ for $\alpha = 1.7$ and $\gamma = 0.2$
for weighted MAX2SAT. The bound on backtracks is $B$. The
lines are exponential fits for the $w = 2, 4$, $B = 5 \times 10^6$ data.

We estimate the quantity $P(2L, L, w)$, the probability
that there is a change in the central region of area $w^2$ when
the system size expands from $L$ to $2L$ [16], by sampling
from the WMAXSAT ensemble. To be able to complete
the simulations, we impose an upper limit $B$ on the number
of backtracks in the DPLL search and report $P$ for a range
of $B$. The points in Fig. 4 with $w = 2$ are well fit by an
exponential in $L$, in the limit of large $B$. (A power law
fit gives an exponent less than $-2$, which is inconsistent
with a fractal domain wall picture for a model with two
states [16].) The $w = 4$ data are also well described by
an exponential with the same slope. For $\alpha = 1.7$ and
$\gamma = 0.2$, we estimate a correlation length of $\xi = 2.5 \pm 0.3$.
The exponential approach to a unique state holds for all
$\alpha$ and $\gamma$ that we explored.

Working within the concepts and the algorithms for
spin glasses and other disordered materials, we have stud-
ied the problem of optimal satisfaction of Boolean formu- 
las. There is no thermodynamic SAT to UNSAT transi- 
tion, due to the finite density of small unsatisfiable for- 
mulas. There is a percolation transition, however, in the 
logical structure of the formulas as the clause density is 
increased, that is apparently in the class of standard con- 
nectivity percolation. Below this transition, we use rare 
region arguments to predict a transition in the mean run- 
ning time of an optimization algorithm. We find that the 
ground state is unique even in the high clause density 
regime. This uniqueness suggests that the MAX2SAT 
problem can be solved "locally" by studying subsamples 
larger than the correlation length and patching subsolu- 
tions together (though for large correlation lengths, rare 
regions might again dominate the running time.) This 
general approach in turn has potential applications to 
algorithms for studying spin glasses and other random 
magnets. This project was supported in part by the Na- 
tional Science Foundation through grant DMR-0109164. 